A comparative analysis of progressive multiple sequence alignment approaches using UPGMA and neighbor joining based guide trees

Authors

  • Ravi Kumar Yadav Dega
  • Gunes Ercal
Abstract

Multiple sequence alignment is increasingly important in bioinformatics, with applications ranging from phylogenetic analysis to domain identification. Of the several ways to perform multiple sequence alignment, this work studies the progressive alignment approach. Progressive alignment involves three steps: compute the distance between each pair of sequences; construct a guide tree from the resulting distance matrix; and align the sequences in the order given by the guide tree, using aligned profiles. Our contribution is a comparison of two main methods of guide tree construction, UPGMA and Neighbor Joining, in terms of both the efficiency and the accuracy of the overall alignment. Our experimental results indicate that the Neighbor Joining method is both more efficient in terms of performance and more accurate in terms of overall cost minimization.
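
The difference between the two guide-tree methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the five-taxon distance matrix, the names `NAMES` and `DIST`, and the helper functions `upgma` and `nj_first_pair` are all hypothetical and chosen for illustration. UPGMA repeatedly merges the two clusters with the smallest average distance, while Neighbor Joining picks the pair that minimizes Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of distances from i to every other taxon.

```python
from itertools import combinations

# Hypothetical pairwise distances between five sequences (illustrative only).
NAMES = ["a", "b", "c", "d", "e"]
_RAW = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
        ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
        ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
DIST = {frozenset(pair): value for pair, value in _RAW.items()}


def upgma(names, dist):
    """Build a guide tree as nested tuples by average-linkage (UPGMA) merging."""
    sizes = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)                       # working copy, extended as clusters merge
    while len(sizes) > 1:
        # Merge the two clusters that are currently closest.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to the new cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


def nj_first_pair(names, dist):
    """Return the first pair Neighbor Joining would join (minimum Q value)."""
    n = len(names)
    # r[x] is the total distance from x to all other taxa.
    r = {x: sum(dist[frozenset((x, y))] for y in names if y != x) for x in names}
    return min(
        combinations(names, 2),
        key=lambda p: (n - 2) * dist[frozenset(p)] - r[p[0]] - r[p[1]],
    )


if __name__ == "__main__":
    print(upgma(NAMES, DIST))          # (('a', 'b'), ('c', ('d', 'e')))
    print(nj_first_pair(NAMES, DIST))  # ('a', 'b')
```

On this matrix UPGMA starts by merging d and e, the pair with the smallest raw distance, whereas Neighbor Joining starts with a and b because the Q criterion corrects for how far each taxon is from all the others. A full Neighbor Joining implementation would also update the distance matrix after each join and repeat; only the selection step is shown here.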


Similar resources

Evolving Better Multiple Sequence Alignments

Aligning multiple DNA or protein sequences is a fundamental step in the analyses of phylogeny, homology and molecular structure. Heuristic algorithms are applied because optimal multiple sequence alignment is prohibitively expensive. Heuristic alignment algorithms represent a practical trade-off between speed and accuracy, but they can be improved. We present EVALYN (EVolved ALYNments), a novel...


Improved Bayesian Phylogenetic Inference in a Statistical Alignment Framework Advanced Software Design for StatAlign

Long-term trends in computational phylogenetics show a steady transition of focus from traditional tree reconstruction methods towards Bayesian approaches. The early distance based techniques such as UPGMA and Neighbour Joining are today considered less accurate primarily due to the loss of information when condensing sequence data into a distance matrix. Maximum parsimony is fast but suffers f...


SoRT2: a tool for sorting genomes and reconstructing phylogenetic trees by reversals, generalized transpositions and translocations

SoRT(2) is a web server that allows the user to perform genome rearrangement analysis involving reversals, generalized transpositions and translocations (including fusions and fissions), and infer phylogenetic trees of genomes being considered based on their pairwise genome rearrangement distances. It takes as input two or more linear/circular multi-chromosomal gene (or synteny block) orders in...


PAAA: A Progressive Iterative Alignment Algorithm Based on Anchors

Multiple Sequence Alignment (MSA) is an important method to compare biological sequences. It consists in optimising the number of matches between the residues occurring in the same order in each sequence. MSA is an NP-complete problem [Wang and Jiang 94]. There are several approaches to solve this problem. Progressive approach is the most used and the most effective one, it opera...


Comparison of Phylogenetic Trees of Multiple Protein Sequence Alignment Methods

Multiple sequence alignment is a fundamental part in many bioinformatics applications such as phylogenetic analysis. Many alignment methods have been proposed. Each method gives a different result for the same data set, and consequently generates a different phylogenetic tree. Hence, the chosen alignment method affects the resulting tree. However in the literature, there is no evaluation of mul...



Journal:
  • CoRR

Volume: abs/1509.03530

Publication date: 2015